Abstract


A  systematic  examination  of  calibration  and  quantitation  methods 
available  to  analytical  chemistry  is  made.  First,  simple  linear  calibration 
with  one  sensor  is  reviewed  with  an  emphasis  on  chemical  problems  that  can 
invalidate calibration models and what can be done about them. The focus
then shifts to multivariate methods used for multicomponent analysis, ending
with a discussion of bilinear forms.


The  general  problems  of  calibration  and  quantitation  are  well  known  to 
analytical  chemists.  The  traditional  approach  to  quantitative  analysis  has 
often  been  to  find  a  single  sensor  which  was  specific  for  the  desired  analyte 
and  responded  in  a  linear  manner  to  changes  in  the  analyte  concentration. 

The  requirement  of  a  fully  selective  sensor  frequently  necessitates  various 
separation  or  purification  steps  prior  to  the  analytical  measurement.  An 
alternative  approach  is  to  use  many  analytical  sensors  and  multivariate  data 
analysis  methods.  The  objective  of  this  review  is  to  provide  the  reader  with 
an  overview  of  multivariate  calibration  and  quantitation  methods  and  to 
discuss  some  of  the  assumptions  inherent  in  applying  these  approaches  to 
analytical  measurements. 

Analytical  Chemistry  has  recognized  the  importance  of  this  problem  by 
establishing  a  report  on  Chemometrics  as  a  part  of  its  biennial  Fundamental 
Reviews  issue.  Sections  of  the  past  Chemometrics  reviews  entitled  "Modeling 
and  Parameter  Estimation",  "Calibration",  and  "Resolution"  are  of  particular 
interest  to  researchers  in  this  area  (1,2). 

This  review  is  grouped  into  three  distinct  sections.  A  review  of  the 
single-component  linear  model  will  provide  the  basis  for  development  of  the 
more  complex  models,  such  as  the  multicomponent  linear  model.  The  latter  is 
based  on  one-dimensional  response  measurements;  for  example,  the  absorption 
spectrum  of  a  mixture  sample.  The  third  section  will  discuss  the  multi- 
component  bilinear  model.  This  model  can  be  used  to  describe  two-dimensional 
chemical  measurements  such  as  fluorescence  excitation-emission  matrices,  gas 
chromatography  -  mass  spectrometry  (GC/MS)  data,  liquid  chromatography  -  UV 
(LC/UV)  data,  or  spatial/spectral  data  as  obtained  from  imaging  in  surface 


Single  Component  Linear  Model 


The  situation  most  favored  by  analytical  chemists  is  when  the  response, 
r,  of  a  single  analytical  sensor  is  a  linear  function  of  the  concentration, 
c,  of  a  single  chemical  analyte  of  interest. 

r  =  k  c  (1) 

In order to obtain an estimate of the analyte concentration, two general
steps are required. First there is a calibration step, in which the
sensitivity coefficient, k, is determined based on the analysis of one or more
samples of known concentrations. The second is a quantitation step, in which
the response of the unknown sample is measured and the analyte concentration
is estimated from the calibration model. Implicit in using this simple
linear model is a series of assumptions about the chemical system being
examined: first, the response is linearly related to the analyte
concentration over the concentration region of interest; second, the analytical
sensor is fully specific and responds only to the analyte of interest; and
third, the sensitivity coefficient does not change between the calibration
and quantitation steps. If these three assumptions are obeyed for a given
experimental situation, then the calibration line is obtained by measuring
the response at various analyte concentrations (3).

Since  the  analytical  sensor  is  fully  specific,  the  response  of  a  sample 
containing  no  analyte  is  by  definition  equal  to  zero.  This  implies  that  the 
calibration  line  must  pass  through  the  origin.  In  principle,  measurement  of 
a  single  sample  of  known  analyte  concentration  is  sufficient  to  determine  the 
slope  of  the  calibration  line.  In  practice,  several  measurements  are 
preferred. Even in situations where a theoretically linear relationship is
known to exist, the measured experimental values will rarely be collinear
with theoretical values due to sample variance, measurement errors, and
random noise.
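
The calibration and quantitation steps described above can be sketched in a
few lines of Python. This is only an illustration under stated assumptions:
NumPy is assumed to be available, the standard concentrations and responses
are invented, and the slope is estimated by least squares for a line forced
through the origin, which is one reasonable way to combine several standards.

```python
import numpy as np

# Hypothetical calibration standards (invented values, not from the review).
c_std = np.array([1.0, 2.0, 4.0, 8.0])      # known concentrations
r_std = np.array([0.51, 0.98, 2.03, 3.99])  # measured responses

# Calibration step: least squares slope for r = k c (line through the origin),
# k = sum(c * r) / sum(c * c).
k = np.dot(c_std, r_std) / np.dot(c_std, c_std)

# Quantitation step: invert r = k c for the response of an unknown sample.
r_unknown = 1.50
c_unknown = r_unknown / k

print(f"k = {k:.4f}, estimated concentration = {c_unknown:.3f}")
```

With a single standard the same expression reduces to k = r/c, which is the
one-point calibration mentioned in the text.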


Random  Error 


The  method  of  least  squares  is  commonly  used  to  estimate  the  position  of 
the  calibration  line.  The  mathematical  formula  for  calculation  of  the  least 
squares  regression  line  and  the  confidence  region  around  this  line  are  well 
known  (4,5).  Several  additional  assumptions  are  required  when  the  method  of 
least  squares  is  used  to  estimate  the  calibration  relationship  (6).  The 
first  assumption  is  that  all  the  measurement  error  is  associated  with  the 
dependent  variable,  the  measured  response.  This  condition  requires  the 
variance  in  the  concentrations  of  the  standard  samples  to  be  much  smaller 
than  the  variance  in  the  corresponding  measured  responses.  Secondly,  each 
measured  response  is  drawn  from  a  normal  distribution  with  a  mean  equal  to 
the  true  response  for  the  corresponding  analyte  concentration.  This  requires 
that  repeated  measurements  of  the  response  for  a  single  sample  yield  a 
Gaussian  distribution.  Lastly,  the  variance  of  the  measured  response  must  be 
independent  of  the  analyte  concentration,  or  in  statistical  terms  there  must 
be  homogeneity  of  variances. 

If M calibration samples have been analyzed, with each calibration
sample being measured one or more times such that N total calibration
measurements were made, then the calibration step requires estimating the
values of two parameters: the slope, $\beta_1$, and the intercept, $\beta_0$,
of the least squares line. For the single-component linear model, the least
squares problem can be expressed as minimizing the sum of the squares of the
residuals in the following vector equation:

$r = \beta_0 + \beta_1 c + e$  (2)

where r is a column vector containing the N measured responses, c is a column
vector containing the N known analyte concentrations, and e is the vector of
residual errors not fitted by the model.
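
As a rough illustration of this least squares problem, the sketch below fits
a straight line with an intercept and a slope to a small set of invented
calibration data. NumPy is assumed; the variable names `beta0` and `beta1`
are chosen here to stand for the intercept and slope and are not taken from
the original text.

```python
import numpy as np

# Invented calibration data: N = 5 measurements.
c = np.array([0.0, 1.0, 2.0, 3.0, 4.0])       # known concentrations
r = np.array([0.10, 0.62, 1.09, 1.61, 2.10])  # measured responses

# Design matrix with a column of ones (intercept) and the concentrations.
X = np.column_stack([np.ones_like(c), c])

# Ordinary least squares: minimizes the sum of squared residuals e'e.
beta, *_ = np.linalg.lstsq(X, r, rcond=None)
beta0, beta1 = beta

# Residual vector e, the part of r not fitted by the model.
e = r - X @ beta
print(f"intercept = {beta0:.3f}, slope = {beta1:.3f}")
```

At the least squares solution the residuals sum to zero and are uncorrelated
with the concentrations, which is a convenient check on the fit.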


Shewell (7) has observed that varying the location of the calibration
points will have an effect on the accuracy of the estimates obtained for the
slope and the intercept of the regression line. In general, for a constant
total number of calibration measurements, N, the most accurate estimate of the
intercept, $\beta_0$, is obtained if N-1 measurements are made at the lowest
permissible analyte concentration and one measurement is made at the highest
permissible analyte concentration. If the most accurate estimate of the slope
is desired, then the measurements should be equally divided between the
highest and lowest permissible concentration levels.


Agterdenbos  (8)  considered  the  effect  of  altering  the  concentrations  of 
calibration  samples  on  the  precision  obtained  in  the  final  concentration 
estimate.  Both  the  distribution  of  the  calibration  measurements  over  the 
concentration  range  of  interest  and  the  number  of  replicate  measurements  were 
found  to  influence  the  results  obtained  in  the  subsequent  quantitation  step. 

A  new  quantity,  the  eccentricity,  can  be  defined  to  describe  the  relationship 
between  the  precision  of  the  estimated  sample  concentration  and  locations  of 
the  calibration  points.  The  precision  of  the  estimated  sample  concentration, 
$\Delta x$, is a function of several parameters: the selected statistical significance
level,  t;  the  standard  deviation  of  the  analytical  procedure,  s;  the  total 
number  of  calibration  measurements,  N;  the  number  of  replicate  measurements  of 
the  sample,  n;  and  the  location  of  the  sample  measurement  in  the  calibration 
range  or  the  eccentricity,  E. 


$\Delta x = \pm\, t\, s\, (N^{-1} + n^{-1} + E)^{1/2}$  (3)

The eccentricity, E, can be calculated from the following relationship:

$E = (x_s - \bar{x})^2 / \sum_{i=1}^{N} (x_i - \bar{x})^2$  (4)

where $x_s$ is the mean estimated analyte concentration, and $\bar{x}$ is the mean
concentration  of  the  calibration  samples.  This  is  equivalent  to  the  center 
of  gravity  of  the  calibration  plot.  From  these  relationships  it  is  clear 
that  the  minimum  uncertainty  in  the  estimated  analyte  concentration  will 
occur  when  the  sample  concentration  is  equal  to  the  mean  concentration  of 
the  calibration  samples,  in  which  case  the  eccentricity  is  equal  to  zero. 

As  the  estimated  sample  concentration  moves  to  a  value  further  from  the 
center  of  gravity  of  the  calibration  plot  the  eccentricity  increases  and  the 
precision  of  the  estimated  sample  concentration  becomes  poorer. 
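
The behavior of the eccentricity can be checked numerically. The sketch below
assumes that equation 4 is the squared distance of the sample concentration
from the mean calibration concentration, divided by the sum of squared
deviations of the calibration concentrations from their mean. The function
name `eccentricity` and the data are hypothetical.

```python
import numpy as np

def eccentricity(x_s, x_cal):
    """Eccentricity of a sample concentration x_s relative to the
    calibration concentrations x_cal (assumed form of equation 4)."""
    x_cal = np.asarray(x_cal, dtype=float)
    x_bar = x_cal.mean()  # center of gravity of the calibration plot
    return (x_s - x_bar) ** 2 / np.sum((x_cal - x_bar) ** 2)

# Invented calibration design: five equally spaced standards.
x_cal = [1.0, 2.0, 3.0, 4.0, 5.0]

E_center = eccentricity(3.0, x_cal)  # sample at the calibration mean
E_edge = eccentricity(5.0, x_cal)    # sample at the upper end of the range
print(E_center, E_edge)
```

The eccentricity vanishes at the calibration mean and grows quadratically
toward the ends of the range, which is why the precision of the estimate is
best near the center of gravity of the calibration plot.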

In many analytical procedures, the assumption of homogeneity of
variances may be false. For example, the precision obtained in
spectrophotometry may be limited by the measurement readout error, detector
shot noise, or source flicker noise (9). The classical approach to estimating
measurement
precision  has  assumed  that  the  readout  error  of  a  linear  transmittance  scale 
is  the  dominant  factor;  in  modern  instruments  it  is  far  more  likely  that  the 
dominant  factor  will  be  the  photomultiplier  shot  noise. 

Agterdenbos  (6)  has  suggested  that  chemists  give  more  care  to  the 
proper  selection  of  the  calibration  relationship  being  used  when  performing 
a  least  squares  analysis.  One  method  for  obtaining  the  calibration  line 
when  the  precision  of  the  response  measurement  is  dependent  on  the  analyte 
concentration  is  to  use  a  weighted  least  squares  procedure.  Weighted  least 
squares  regression  is  analogous  to  ordinary  least  squares.  Both  methods  are 
based on minimizing the sum of the squared deviations between the actual
responses and the calibration line. However, when weighted least squares is
used, each residual is multiplied by a weighting factor, $w_i$, proportional to
the reciprocal of the variance of the corresponding response measurement,
$r_i$. The relationship to be minimized is now given as

$\sum_{i} w_i (r_i - \beta_1 c_i - \beta_0)^2$  (5)

Schwartz  (10)  has  illustrated  the  potential  for  nonuniform  variance  in 
both  spectroscopic  and  chromatographic  experiments.  He  concluded  that  if 
the  analyst  ignores  variance  nonuniformity,  roughly  the  same  calibration 
curve  will  be  obtained.  However,  the  confidence  bands  around  the  estimated 
analyte  concentration  may  be  severely  in  error  at  the  extremes  of  the 
calibration  curve.  Garden  and  co-workers  (11)  have  shown  how  weighted  least 
squares  procedures  can  improve  the  precision  of  the  estimated  analyte 
concentration  when  compared  to  ordinary  least  squares  calibration. 
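
A minimal weighted least squares sketch is given below. It assumes, purely
for illustration, that the standard deviation of each response is
proportional to the concentration (a 2% relative error); the data and this
noise model are invented. Each squared residual is weighted by the reciprocal
of the assumed variance, and the weighted normal equations are solved
directly with NumPy.

```python
import numpy as np

# Invented calibration data with concentration-dependent precision.
c = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
r = np.array([0.52, 1.01, 1.95, 4.10, 7.90])

# Assumed noise model: standard deviation proportional to concentration.
sigma = 0.02 * c
w = 1.0 / sigma**2  # weights proportional to 1 / variance

# Design matrix for a line with intercept and slope.
X = np.column_stack([np.ones_like(c), c])

# Weighted normal equations: (X' W X) beta = X' W r, with W = diag(w).
XtW = X.T * w  # broadcasting scales column i of X' by w[i]
beta0, beta1 = np.linalg.solve(XtW @ X, XtW @ r)
print(f"intercept = {beta0:.4f}, slope = {beta1:.4f}")
```

Setting all weights equal recovers the ordinary least squares line, so the
two procedures differ only when the variances are not homogeneous.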

Deterministic  Error 

In  addition  to  the  statistical  errors  which  may  arise  from  the  improper 
application  of  least  squares  methods,  the  single-component  linear  model  may 
also  be  affected  by  various  types  of  deterministic  errors.  In  most  cases 
these  deterministic  errors  can  be  traced  to  violation  of  the  initial 
assumptions  underlying  the  original  model.  Often  if  the  source  of  the  error 
can  be  identified,  the  calibration  model  can  be  adjusted  to  bring  into 
consideration  the  effects  of  these  additional  factors. 


One  serious  problem  which  occurs  frequently  in  analytical  chemistry  is 
the  presence  of  a  sample  matrix  effect.  This  can  be  defined  as  a  difference 
in the sensitivity coefficient, k, between the unknown sample being analyzed
and  the  calibration  standards.  The  interaction  of  the  analyte  with  the 
sample  matrix  results  in  a  change  in  the  slope  of  the  calibration  plot. 

The  method  of  standard  addition  is  widely  used  in  analytical  chemistry 
to  address  this  particular  type  of  problem.  The  response  of  the  unknown 
sample  is  measured,  a  known  amount  of  the  pure  analyte  is  added  to  the 
sample,  and  then  the  response  of  the  sample  after  this  addition  is  measured. 
The initial response measurement, $r_0$, depends only on the unknown
concentration, $c_0$. After the addition is made the response is a function of
both the original analyte present and the amount added. The matrix-corrected
sensitivity coefficient is obtained from the change in the response due to the
addition of pure analyte. This method still requires the response to be
linearly  related  to  the  concentration  and  the  sensor  to  be  totally  specific 
for  the  analyte  of  interest. 
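
The arithmetic of a single standard addition can be sketched as follows. The
sketch assumes that the addition changes the sample volume negligibly, so
that the response before the addition is k c0 and the response after it is
k (c0 + delta_c). The function name `standard_addition` and the numbers are
hypothetical.

```python
def standard_addition(r0, r1, delta_c):
    """Estimate the original concentration from one standard addition.

    r0      -- response of the sample before the addition
    r1      -- response after adding delta_c of pure analyte
    delta_c -- concentration increase caused by the addition
               (dilution by the added volume is assumed negligible)
    """
    # Matrix-corrected sensitivity from the change in response.
    k = (r1 - r0) / delta_c
    # The original response arises from the unknown concentration alone.
    c0 = r0 / k
    return c0, k

c0, k = standard_addition(r0=0.30, r1=0.50, delta_c=2.0)
print(f"k = {k:.3f}, c0 = {c0:.3f}")
```

Because k is determined in the sample itself, a matrix effect that changes
the slope is corrected, but a nonzero background response would still bias
the estimate.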

Different  groups  of  workers  have  applied  statistical  techniques  to 
calculate  the  optimum  method  of  making  the  additions  and  the  resulting 
precision  in  the  estimate  of  the  analyte  concentration  (12,13).  The  optimal 
size  of  the  addition  to  be  made  is  a  function  of  the  precision  of  the  sensor 
and  the  form  of  the  calibration  function.  Franke,  de  Zeeuw,  and  Hakkert  (14) 
concluded  that  if  a  single  addition  is  made,  then  optimum  precision  is 
obtained  by  making  an  addition  of  the  largest  possible  amount  of  standard 
without  exceeding  the  linear  range  of  the  calibration  curve  and  making  an 
equal  number  of  replicate  measurements  before  and  after  the  addition. 

The single-component linear model assumes that the analytical sensor
possesses total specificity for the analyte of interest. Implicit in this
assumption is a requirement that the response of the sensor at zero analyte
concentration is zero response units, or simply stated, the sensor can be
zeroed. Two types of problems may lead to failure of this assumption:
first, an instrumental or constant background; and second, a sample- or
volume-dependent background.

An instrumental background will result in the addition of a constant
non-volume-dependent term, d, to the simple linear model.

r  =  k  c  +  d  (6) 

In a spectrophotometric analysis, this constant term may arise from the use
of  mismatched  optical  paths  or  cells,  temperature  differences  between  the 
sample  and  calibration  solutions,  amplifier  offsets,  or  similar  problems. 

An  instrumental  background  will  cause  a  bias  in  the  concentration  estimate 
obtained  from  either  a  normal  calibration  or  a  standard  addition 
experiment.  Fortunately,  this  type  of  background  can  be  handled  by 
standard  dilution. 

A  significantly  more  difficult  problem  is  the  presence  of  a  sample 
background. In this situation the sensor no longer possesses complete
specificity,  but  responds  both  to  the  analyte  of  interest  and  also  one  or 
more  other  components  present  in  the  sample.  This  problem  has  given  rise 
to  a  multitude  of  separation  and  purification  techniques  directed  at 
eliminating  potential  interferences.  If  the  identities  of  the  additional 
components  are  known  and  standards  of  these  components  are  available,  then 
the  situation  can  be  treated  as  a  multicomponent  analysis  problem. 

However, if the identities of any of the interferents are not known or if
calibration  with  these  components  is  not  possible,  then  the  situation 
represents  a  sample-  or  volume-dependent  background  problem. 


Multi-Component Linear Model


The  multicomponent  linear  model  is  simply  a  generalization  of  the 
familiar  single-component  model.  The  responses  due  to  each  of  the 
components  present  in  the  sample  are  assumed  to  add  linearly,  or  can  be 
transformed  to  yield  the  total  response  for  any  analytical  sensor.  For  the 
case  of  two  sensors,  which  respond  to  both  of  two  analytes,  a  system  of  two 
equations  is  obtained.  This  can  be  written  as 


$r_1 = \sum_{i=1}^{2} c_i k_{i1} = c_1 k_{11} + c_2 k_{21}$

$r_2 = \sum_{i=1}^{2} c_i k_{i2} = c_1 k_{12} + c_2 k_{22}$  (7)


where $r_j$ is the response measured at the j-th sensor, $c_i$ is the
concentration of the i-th analyte, and $k_{ij}$ is the sensitivity coefficient
of the j-th sensor for the i-th analyte. Each equation represents the measured
response for a single analytical sensor as the sum of the responses due to
the individual components. For a mixture of N components, this model is
generally written in matrix form as follows


$r' = c'K$  (8)


The vector r is a column vector containing the response of the sample
measured with P different sensors. The vector c is a column vector containing
the  concentrations  of  the  N  analytes  present  in  the  sample.  The  prime 
denotes  the  transpose  of  a  matrix  or  vector.  The  matrix  K  contains  the 
sensitivity  coefficients  for  the  N  components  at  each  of  the  P  analytical 
sensors.  Each  row  of  the  K  matrix  contains  the  P  sensitivity  coefficients  of 
a  single  analyte.  Each  column  of  the  K  matrix  contains  the  sensitivity 
coefficients  of  all  N  components  for  the  same  analytical  sensor. 
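
The layout of the matrix model can be made concrete with a small numerical
example. The sensitivity values below are invented; the sketch only shows how
a concentration vector for N = 2 analytes and an N x P sensitivity matrix
combine to give the P = 3 sensor responses.

```python
import numpy as np

# Invented sensitivity matrix K: one row per analyte, one column per sensor.
K = np.array([[0.9, 0.4, 0.1],   # analyte 1 at sensors 1..3
              [0.2, 0.5, 0.8]])  # analyte 2 at sensors 1..3

# Concentrations of the two analytes in the mixture.
c = np.array([2.0, 1.0])

# r' = c'K: each response is the sum of the individual contributions.
r = c @ K
print(r)
```

Each element of `r` is the sum over analytes of concentration times
sensitivity at that sensor, which is the matrix form of the two-sensor,
two-analyte system written out above.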
Several assumptions are implicitly made when the multicomponent linear
model  is  used.  These  assumptions  are  analogous  to  the  assumptions  made  with 
the  single-component  linear  model.  First,  the  response  of  each  sensor  is 
assumed to be linearly related to all analyte concentrations over the
concentration ranges of interest. Second, the response due to each component
present in the mixture sample is independent of the other N-1 components.
Third,  the  response  of  each  sensor  can  be  zeroed.  Lastly,  the  sensitivity 
coefficients  do  not  change  between  the  calibration  and  quantitation  steps. 

Classification  of  Samples 

Martens et al. (15) proposed the classification scheme for multicomponent
mixtures shown in Table I. Mixture samples in which the individual
component  concentrations  are  known  will  be  designated  as  class  1  samples; 
those  samples  in  which  the  component  concentrations  are  not  known  will  be 
designated  as  class  2  samples.  Class  1  samples  are  further  grouped  into  two 
types. The first group is class 1A. These samples are fully defined: both
the individual component concentrations and the pure component sensitivities
are known to the analyst. In the second group, class 1B, the individual
component  concentrations  are  known  but  the  pure  component  sensitivities  are 
unknown.  If  N  components  are  present  in  the  mixture,  then  estimation  of  the 
individual  component  sensitivities  requires  either  N  pure  samples  or  N 
mixtures  of  known  composition.  This  class  of  samples  represents  the  general 
problem  of  multicomponent  calibration. 

Class  2  samples  are  also  grouped  into  two  types.  The  first  group,  class 
2A,  are  samples  in  which  the  component  concentrations  are  unknown  but  the 
individual  pure  component  sensitivities  are  known.  This  type  of  sample  is 
representative of a multicomponent quantitation problem. The second group,
class  2B  samples,  are  mixtures  in  which  neither  the  individual  component 
concentrations  nor  their  sensitivities  are  known  by  the  analyst. 


Analysis of a single class 2B sample is not possible. However, if a set
of class 2B samples is available in which the relative concentrations of the
components vary from sample to sample, then the methods appropriate to the
multicomponent bilinear model may be used to obtain regions of physically
allowable pure component sensitivities and concentrations. An unambiguous
solution  for  the  sensitivities  and  individual  component  concentrations  is  not 
possible  unless  the  analytical  chemist  can  obtain  further  information  about 
the  samples. 

Calibration

Several different methods are available for calibration in a
multicomponent analysis. Kaiser (16) has grouped calibration methods into three
main approaches: 1) σ-calibration, or calibration with synthetic standards;
2) α-calibration, or calibration with analyzed standard samples; and
3) δ-calibration, or calibration by differential additions. Of these three
methods, calibration with synthetic standard samples is in Kaiser's view the
most fundamental. Given a mixture of N components whose response can be
measured at
each  of  P  different  analytical  sensors,  the  question  arises  of  selecting  the 
best  method  for  first  performing  the  calibration  and  then  estimating  the  N 
analyte  concentrations.  If  samples  of  the  N  pure  components  are  available, 
then  the  simplest  method  of  obtaining  the  sensitivity  coefficients  is  to 
individually  measure  the  response  of  each  pure  compound.  This  method  may  not 
always  be  possible.  In  some  cases  the  pure  substances  may  be  very  difficult 
or  expensive  to  obtain  and  purify  or  the  mixture  may  include  analytes  which 
are unstable in purified form. If the pure analytes are not available, but it
is possible to obtain a set of pre-analyzed standard samples, then the
calibration can be based on comparison to these standard mixtures. Finally, if
matrix  effects  are  known  to  be  present,  the  most  appropriate  method  is  to  use 
a  standard  addition  analysis  to  allow  calibration  within  the  sample  matrix. 

Multicomponent  calibration  based  on  the  analysis  of  a  set  of  mixture 
samples of known analyte concentrations to obtain the calibration
relationship is frequently used. If a well characterized set of M mixture
samples is
obtainable,  the  sensitivity  coefficients  can  be  obtained  by  ordinary  or 
weighted  least  squares  multiple  regression.  The  normal  representation  of 
this  problem  is  as  follows 

R  =  C  K  (9) 

where  R  is  an  M  x  P  matrix  of  measured  responses  for  each  of  the  M  mixture 
samples,  C  is  an  M  x  N  matrix  containing  the  N  analyte  concentrations  for 
each  of  the  mixtures,  and  K  is  defined  as  before.  Since  the  concentrations 
of  all  of  the  analytes  in  each  mixture  sample  are  known  and  the  mixture 
responses  can  be  measured,  the  sensitivity  matrix,  K,  can  be  obtained  by 
multiplying  both  sides  of  equation  9  by  the  inverse  of  C.  If  there  are  the 
same  number  of  mixture  samples  as  analytes  present,  i.e.  M  =  N,  then  this 
system  of  linear  equations  is  exactly  determined  and  the  calibration  step 
requires  inverting  C  to  yield 

$K = C^{-1} R$  (10)

However,  if  the  number  of  calibration  mixtures  used  is  greater  than  the 
number  of  analytes,  i.e.  M  >  N,  then  the  best  estimate  of  the  sensitivity 
matrix, K, is generally calculated from least squares multiple regression in
matrix form. The generalized inverse solution for the sensitivity matrix is
given as

$K = (C'C)^{-1} C'R$  (11)

In  order  to  obtain  the  sensitivity  matrix,  K,  from  either  equation  10  or  11, 
the  number  of  analytical  sensors,  P,  must  be  greater  than  or  equal  to  the 
number of analytes. The analyte concentrations in an unknown mixture sample
can be obtained by measuring its response and multiplying the transposed
response vector by the generalized inverse of K,

$c' = r'K'(KK')^{-1}$  (12)

For  the  entire  analysis,  calibration  and  quantitation  of  N  analytes,  this 
procedure  requires  at  least  N  mixture  samples,  and  inversion  of  two  N  x  N 
matrices: C'C and KK'.
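
The complete procedure, calibration from M known mixtures followed by
quantitation of an unknown, can be sketched with simulated data. Everything
numerical here is an assumption made for illustration: the "true"
sensitivities, the noise level, and the random seed are invented, and the
normal equations are solved directly as written in equations 11 and 12.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented "true" sensitivities: N = 2 analytes, P = 4 sensors.
K_true = np.array([[0.9, 0.4, 0.1, 0.3],
                   [0.2, 0.5, 0.8, 0.6]])

# Calibration set: M = 6 mixtures of known composition, with small noise.
C = rng.uniform(0.5, 5.0, size=(6, 2))
R = C @ K_true + rng.normal(scale=0.01, size=(6, 4))

# Calibration (equation 11): K = (C'C)^-1 C'R, an N x N system.
K_hat = np.linalg.solve(C.T @ C, C.T @ R)

# Quantitation (equation 12): c' = r'K'(KK')^-1, a second N x N system.
c_true = np.array([1.5, 3.0])
r = c_true @ K_true  # noise-free response of the "unknown"
c_hat = r @ K_hat.T @ np.linalg.inv(K_hat @ K_hat.T)
print(c_hat)
```

In practice a least squares routine such as `numpy.linalg.lstsq` would
usually be preferred over forming the inverses explicitly; the explicit form
is kept here only to mirror the equations.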

Brown  and  associates  (17,18)  have  proposed  an  alternative  formulation  of 
the  matrix  multicomponent  model,  where  instead  of  considering  the  response  as 
a  function  of  concentration,  they  consider  the  concentration  a  function  of  the 
measured  response.  This  is  written  as 

C  =  R  P  (13) 

where  the  matrices  C  and  R  are  defined  as  before  and  the  matrix  P  represents 
the  proportionality  between  C  and  R.  The  matrix  P  will  have  dimensions  of  P 
x N, i.e. sensors by analytes. For this model, the calibration step is
expressed as

$P = (R'R)^{-1} R'C$  (14)

which requires the inversion of the P x P matrix, R'R. Quantitation of an
unknown  sample  is  accomplished  directly  by  multiplying  the  response  vector, 
r,  by  the  calibration  matrix,  P,  to  yield 


$c' = r'P$  (15)


The authors state that this method has the advantage of requiring only
one matrix inversion instead of the two required by the conventional
notation. Subsequently, they used this method for the spectrophotometric
analysis of serum lipids with 85 calibration samples and 15 analytical
wavelengths (19).
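
A sketch of this inverse formulation is given below. It assumes the
calibration matrix is the ordinary least squares solution of C = R P, that
is, the normal-equation estimate that requires inverting R'R. The name
`P_hat` is used for the calibration matrix to avoid confusion with the number
of sensors, and all data are simulated with invented values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented system: N = 2 analytes measured at 2 sensors.
K_true = np.array([[0.9, 0.3],
                   [0.2, 0.7]])

# Calibration set: M = 8 mixtures, more mixtures than sensors.
C = rng.uniform(0.5, 5.0, size=(8, 2))
R = C @ K_true + rng.normal(scale=0.01, size=(8, 2))

# Calibration: least squares solution of C = R P, one system in R'R.
P_hat = np.linalg.solve(R.T @ R, R.T @ C)

# Quantitation: multiply the response of the unknown by P_hat directly.
c_true = np.array([2.0, 1.0])
r = c_true @ K_true
c_hat = r @ P_hat
print(c_hat)
```

Only one linear system is solved, and quantitation is a single matrix-vector
product, which is the advantage the authors point out.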

The difficulty in this analysis lies in the relative dimensions of the
various  matrices.  In  order  to  obtain  a  solution  of  equation  14  there  must  be 
more  calibration  mixtures  than  sensors  being  used.  This  is  a  drawback  when 
the  availability  of  diode  array  spectrophotometers  makes  it  possible  to 
measure  256  or  more  wavelengths  as  easily  as  four  or  five.  In  order  to  use 
all  of  the  available  wavelengths,  one  calibration  sample  must  be  prepared  for 
every  sensor  used.  Additionally,  as  the  number  of  calibration  samples  and 
wavelengths  are  increased  the  size  of  the  matrix  R’R,  which  must  be  inverted 
in  the  calibration  step,  is  also  increased.  If  the  sensors  themselves  are 
highly  correlated  or  if  the  number  of  analytes  is  much  less  than  the  number 
of  sensors,  then  the  matrix  R’R  may  have  an  effective  rank  of  much  less  than 
P. In this situation, R'R will be almost singular and the inversion will be
numerically unstable.
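
The rank problem can be seen in a small simulation. The sketch assumes
noise-free responses from N = 2 analytes measured at 10 sensors; the
sensitivities and concentrations are random, invented values. The response
matrix then has rank 2 no matter how many sensors are used, so R'R cannot be
inverted reliably.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented system: 2 analytes, 10 sensors, 20 calibration mixtures.
K = rng.uniform(0.1, 1.0, size=(2, 10))
C = rng.uniform(0.5, 5.0, size=(20, 2))
R = C @ K  # noise-free responses

# The rank of R is limited by the number of analytes, not of sensors.
rank = np.linalg.matrix_rank(R, tol=1e-8)

# The condition number of the 10 x 10 matrix R'R is therefore enormous.
cond = np.linalg.cond(R.T @ R)
print(rank, cond)
```

With measurement noise the matrix becomes formally full rank, but the small
singular values then reflect noise only, and the inversion remains unstable.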


The method of principal component regression can be used as an
alternative to ordinary least squares multiple regression (20). This method is
based  on  replacing  the  M  x  P  response  matrix,  R,  with  the  product  of  two 
smaller  matrices,  T  and  B.  Equation  13  can  now  be  written  as 


$C = T B P$  (16)




where  the  matrix  T  has  dimensions  M  x  A  and  the  matrix  B  has  dimensions  A  x  P, 
with A << N and A << P. Decomposition of R into the matrices T and B is
called  singular  value  decomposition,  eigenvector  projection,  factor  analysis, 
or  principal  component  analysis,  depending  on  the  scaling  of  R.  The  matrices 
T  and  B  are  selected  in  order  to  represent  R  as  closely  as  possible  and  such 
that  the  columns  of  T  and  the  rows  of  B  are  both  orthogonal.  Geometrically, 
the decomposition of R into TB can be considered as a projection of the
original data points, or mixture spectra, from a P-dimensional measurement
space into a smaller A-dimensional space. The matrix T, whose elements are
sometimes called the factor scores, contains the coordinates of the data
points in the new A-dimensional space and the matrix B, containing the factor
loadings, is the rotation matrix used to perform the projection. Solution of
the original calibration problem now requires the inversion of T'T instead of
R'R. Since the columns of T are orthogonal, this inversion is numerically
well conditioned. This yields a calibration matrix G instead of the
calibration matrix P.

$G = B P = (T'T)^{-1} T'C$  (17)

The  desired  calibration  matrix  P  can  then  easily  be  found  as 

P  =  B’G  (18) 

The quantitation step is exactly the same as used by Brown and co-workers (17,18).
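
Principal component regression can be sketched with a singular value
decomposition. In this illustration the scores T are taken as the first A
left singular vectors scaled by their singular values and the loadings B as
the first A right singular vectors; this scaling is one common choice and is
an assumption here, as are the simulated data. No centering is applied. The
name `P_cal` stands for the calibration matrix.

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented system: 2 analytes, 10 sensors, 20 calibration mixtures.
K_true = rng.uniform(0.1, 1.0, size=(2, 10))
C = rng.uniform(0.5, 5.0, size=(20, 2))
R = C @ K_true + rng.normal(scale=0.005, size=(20, 10))

A = 2  # number of factors retained (assumed known here)

# Decompose R and keep the first A factors: R is approximately T B.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
T = U[:, :A] * s[:A]  # scores, M x A, orthogonal columns
B = Vt[:A, :]         # loadings, A x P, orthonormal rows

# Regress C on the scores: G = (T'T)^-1 T'C, where T'T is diagonal.
G = np.linalg.solve(T.T @ T, T.T @ C)

# Calibration matrix in the original sensor space: P = B'G.
P_cal = B.T @ G

# Quantitation of an "unknown" from its response vector.
c_true = np.array([1.0, 4.0])
c_hat = (c_true @ K_true) @ P_cal
print(c_hat)
```

Because T'T is diagonal, the inversion involves only A well separated
numbers, in contrast to the nearly singular P x P matrix R'R.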

A  different  approach  to  the  multicomponent  calibration  problem,  called 
partial least squares in latent variables (PLS), has been suggested by S.
Wold and co-workers (21). PLS was developed by H. Wold (22) to solve complex
data analysis problems in econometrics and psychometrics. It is somewhat
analogous to principal component multiple regression in that the independent
variables, in this case the matrix R, are described by a principal component
type model and then combined with a regression relationship relating the
responses to the analyte concentrations contained in the matrix C. The
difference is that in the PLS method the projection T is computed not only to
model R but also to maximize its correlation with C. In principal component
regression,  T  is  selected  only  to  model  R.  The  PLS  method  involves  first 
scaling  both  the  response  matrix,  R,  and  the  concentration  matrix,  C,  such 
that the standard deviation of each column in these matrices is equal to one.
The matrices are then centered by subtracting the average for each column.
Each matrix is then modeled as a linear combination of new orthogonal latent
variables. The latent variables are calculated by an iterative method which
does not involve an explicit regression step. The maximum number of latent
variables  is  the  actual  number  of  independent  variables;  however,  normally 
fewer  latent  variables  are  used  to  allow  filtering  of  the  noise  present  in 
the  data  set.  The  PLS  model  is  described  as  follows 
$u_{i,l} = \rho_l t_{i,l}$  for all $l = 1, \ldots, A$  (21)

where $u_{i,l}$ and $t_{i,l}$ are the latent variables and $b_{l,j}$ and
$d_{l,k}$ are the loadings used to describe the concentration and response
matrices,
respectively.  Equations  19  and  20,  known  as  the  outer  relationship, 
describe  the  projection  of  the  original  variables  into  an  A-dimensional 
space.  Equation  21,  known  as  the  inner  relationship,  describes  the 
correlation  between  the  latent  variables.  The  quantitation  steD  in  PLS  is 
accomplished  by  first  centering  and  scaling  the  measured  response  spectrum 
of  an  unknown  mixture,  calculating  the  latent  variables,  t.  ., 


from  the 


loadings,  b.  .  ,  calculating  the  latent  variables,  u-  .,  from  the  inner 

i ,  j  i  .  i 

relationship,  and  then  estimating  the  concentrations  from  equation  19. 
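
The scaling, latent-variable extraction, and quantitation steps can be
illustrated with a minimal sketch. It assumes a single analyte column
(the so-called PLS1 simplification, in which the inner relationship reduces to
one slope per latent variable) and noise-free hypothetical data; the names
pls1_fit and pls1_predict, the matrix K, and all numerical values are
illustrative assumptions and are not taken from the work reviewed here.

```python
import numpy as np

def pls1_fit(R, c, n_latent):
    """Fit a single-analyte PLS model by the NIPALS iteration (a sketch)."""
    r_mean, r_std = R.mean(axis=0), R.std(axis=0, ddof=1)
    c_mean, c_std = c.mean(), c.std(ddof=1)
    X = (R - r_mean) / r_std            # centered, unit-variance responses
    y = (c - c_mean) / c_std            # centered, unit-variance concentrations
    W, P, q = [], [], []
    for _ in range(n_latent):
        w = X.T @ y
        w = w / np.linalg.norm(w)       # direction in R most covariant with c
        t = X @ w                       # latent variable for the responses
        p = X.T @ t / (t @ t)           # loadings describing R
        q.append(y @ t / (t @ t))       # inner-relation slope linking t to c
        X = X - np.outer(t, p)          # remove the modeled part of R
        y = y - q[-1] * t               # remove the modeled part of c
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)  # regression vector in scaled units
    return {"b": b, "r_mean": r_mean, "r_std": r_std,
            "c_mean": c_mean, "c_std": c_std}

def pls1_predict(model, r_new):
    """Scale an unknown response vector and estimate the concentration."""
    x = (r_new - model["r_mean"]) / model["r_std"]
    return float(x @ model["b"]) * model["c_std"] + model["c_mean"]

# Hypothetical two-analyte, four-sensor calibration set (noise-free)
K = np.array([[1.0, 0.6, 0.2, 0.1],
              [0.1, 0.4, 0.9, 0.5]])
C = np.array([[1.0, 0.2], [0.5, 1.0], [0.8, 0.8],
              [0.2, 0.5], [1.2, 0.3], [0.6, 1.4]])
R = C @ K

model = pls1_fit(R, C[:, 0], n_latent=2)
c_unknown = np.array([0.7, 0.9])
c1_hat = pls1_predict(model, c_unknown @ K)   # estimate of the first analyte
```

With noisy data, fewer latent variables than the rank of R would normally be
retained, which is the noise filtering mentioned above.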

The PLS method has been compared to principal component regression for the
multicomponent calibration and quantitation of spectrofluorimetric data
from mixtures of humic acid and ligninsulfonate by Lindberg and co-workers
(23). They concluded that: first, PLS was computationally faster than
principal component regression; second, PLS calibrations have better
predictive qualities since the method extracts information which has
predictive relevance for the concentrations of the calibration mixture;
third, a criterion could be established for determining if the calibration
model was appropriate for a given unknown mixture; and fourth, like other
methods based on principal component analysis, PLS was able to compensate
for unidentified fluorescent species in the solution. This final conclusion
implies that an analyte can be quantitated in the presence of a totally
unknown background, but the experimental data reported do not support this
conclusion.


It was already noted that matrix effects can affect the accuracy of
the calibration in a single component analysis. Exactly the same
difficulties may arise with the multicomponent linear model. As Kaiser (16)
noted, standard additions provide the most appropriate calibration method
if matrix effects occur. When discussing the single component linear
model, it was observed that in most cases the well known standard addition
method was able to correct for these matrix effects, but the simple
standard addition method required a fully selective sensor. Saxberg and
Kowalski (24) have developed a multicomponent extension of the standard
addition method which they named the generalized standard addition method
or GSAM. The generalized standard addition method has two distinct
advantages: first, it allows the use of non-selective analytical sensors;
and second, it corrects for the presence of matrix effects. The first
advantage is a byproduct of the multicomponent nature of the method. The
method does not require the individual sensors to be fully selective for any
one analyte; however, it does require that the sensors do not respond to
components in the sample for which additions have not been made. The
second advantage is the result of using standard additions and making all
the measurements within the sample matrix in order to obtain the sensitivity
matrix, K. The response of each sensor is normally assumed to obey
the linear multicomponent model given in equation 8, but models involving
higher order polynomial relationships between the concentration and
absorbance were also described.


Experimentally, GSAM requires that M additions are made to the sample
being analyzed. Each addition may contain one or more of the pure analytes;
however, the additions must be made such that each pure analyte is
added to the sample at least once. After each addition the response at
each of P analytical sensors is measured. The response after the m-th
addition is modeled as

$$r_m' = c_m' K \qquad (22)$$

where $r_m$ is a column vector containing the measured responses, and $c_m$
is a column vector containing the total concentrations of analyte present
($c_0 + \Delta c$). The response matrix, R, and the concentration matrix, C,
are defined as

$$R = [r_1, r_2, \ldots, r_M]' \qquad (23)$$

$$C = [c_1, c_2, \ldots, c_M]' \qquad (24)$$


This  allows  a  simple  formulation  of  the  problem  as 


R  =  C  K  (25) 

Each  row  of  R  and  C  corresponds  to  a  separate  multiple  standard  addition. 

The matrix R is always known. The matrix C is unknown since each row
includes the unknown analyte concentration plus the amount of analyte which
has been added. The matrix of sensitivity coefficients, K, is also unknown.
Solution of this multiple linear system is accomplished by separating
the terms as follows

$$C = \Delta C + C_0 \qquad (26)$$

$$R = \Delta R + R_0 \qquad (27)$$

where $C_0$ and $R_0$ are matrices with all rows identical to the initial
concentration and initial response vectors $c_0$ and $r_0$, respectively, and
$\Delta C$ and $\Delta R$ are the matrices of the net change in concentrations
and responses due to the standard additions. $\Delta C$ and $\Delta R$ are
always known to the analyst, hence the sensitivity coefficients can be
calculated from

$$\Delta R = \Delta C \, K \qquad (28)$$

The calibration step in GSAM is equivalent to the solution of this linear
system. Assuming $N \neq M$ (more additions than analytes), the solution is
found by

$$K = (\Delta C' \Delta C)^{-1} \Delta C' \Delta R \qquad (29)$$

or if N = M, then $\Delta C$ can be inverted directly. The quantitation step
is given by equation 12. This is identical to the earlier discussion of the
least squares matrix solution of the multicomponent model. It must also be
noted that the matrix $\Delta C$ contains the effective concentration changes
after each standard addition. Unless the volume changes are negligible,
$\Delta C$ cannot be known since the initial concentrations are not known.
This problem can be avoided by incorporating a simple volume correction into
the GSAM to convert from analyte concentrations to absolute quantities.
Equation 22 is now written as

$$r_m' = (1/V_m) \, n_m' K \qquad (30)$$

where the vector $n_m$ contains the absolute quantities, in grams or moles, of
each analyte in a volume, $V_m$. Multiplying both sides of this equation by
$V_m$ leads to

$$q_m = V_m r_m \qquad (31)$$

where the vector $q_m$ contains the P volume corrected responses. The
remaining equations are obtained by substituting the volume corrected
responses, q, Q, and $\Delta Q$, for the responses, r, R, and $\Delta R$, and
by substituting the absolute quantities, n, N, and $\Delta N$, for the
concentrations, c, C, and $\Delta C$.
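
The volume corrected GSAM calculation can be sketched numerically as follows.
This is an illustration under stated assumptions only: the sensitivities,
initial quantities, addition amounts, and volumes are hypothetical, the
responses are simulated without noise, and the quantitation step uses the
pseudoinverse of K as a stand-in for the least squares solution of equation 12.

```python
import numpy as np

# Hypothetical system: N = 2 analytes, P = 3 sensors
K_true = np.array([[2.0, 0.5, 0.1],
                   [0.3, 1.5, 0.8]])      # N x P sensitivity matrix
n0_true = np.array([0.40, 0.25])          # initial quantities, unknown to the analyst
V0 = 10.0                                 # initial sample volume

# Quantity of each analyte delivered in each of M = 4 additions, and its volume
added = np.array([[0.2, 0.0],
                  [0.0, 0.2],
                  [0.2, 0.0],
                  [0.0, 0.2]])
added_volume = np.array([0.5, 0.5, 0.5, 0.5])

dN = np.cumsum(added, axis=0)             # total quantity added so far (M x N)
V = V0 + np.cumsum(added_volume)          # total volume after each addition

# Simulated measurements following equation 30
r0 = (n0_true @ K_true) / V0
R = ((n0_true + dN) @ K_true) / V[:, None]

# Volume correction (equation 31) and net changes
q0 = V0 * r0
Q = V[:, None] * R
dQ = Q - q0

# Calibration: least squares solution of dQ = dN K (volume corrected equation 29)
K_hat, *_ = np.linalg.lstsq(dN, dQ, rcond=None)

# Quantitation: initial quantities from the initial volume corrected response
n0_hat = q0 @ np.linalg.pinv(K_hat)
c0_hat = n0_hat / V0                      # initial concentrations
```

Because only absolute quantities and volume corrected responses enter the
calculation, the unknown initial concentrations are never needed to build the
design matrix.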


Several more recent papers have examined the error propagation and
statistical aspects of using the GSAM. Jochum, Jochum, and Kowalski (25) have
stated that the accuracy of GSAM in obtaining valid estimates of the initial
analyte concentrations is dependent on at least five distinct factors: first,
the accuracy of the response measurements; second, the accuracy and precision
of the multiple standard additions; third, the magnitude of the interanalyte
response interferences; fourth, the experimental design; and finally, the
mathematical algorithms used in the computations. The first two of these
factors are no different than the considerations required for any analytical
method. The final factor, selection of the mathematical algorithms, can
affect the results by introducing round-off errors into the computations.
The upper bound on relative errors in the estimated concentrations was found
to depend on the condition number of both the calibration matrix, K, and the
experimental design, which is described by the addition matrix, $\Delta N$.
The condition number of any nonsingular matrix A is defined as

$$\mathrm{cond}(A) = \|A\| \, \|A^{-1}\| \qquad (32)$$

where $\|A\|$ is the Euclidean norm of the matrix A. If the matrix A is
rectangular, then its condition number is given as

$$\mathrm{cond}(A) = [\mathrm{cond}(A'A)]^{1/2} \qquad (33)$$

It is important to note that the condition number of any matrix is always
equal to or greater than one. In the GSAM experiment, the K matrix is
determined by the solution of an overdetermined system of linear equations
and therefore this matrix is not exactly known. Jochum, Jochum, and
Kowalski showed that errors in the response measurements can be amplified by
the chemist's choice of experimental design. An estimate of the error in
the calculated K matrix was found to be

$$\frac{\|\hat{K} - K\|}{\|K\|} \le \mathrm{cond}(\Delta N) \,
\frac{\|\Delta\hat{q}_1 - \Delta q_1\|}{\|\Delta q_1\|} \qquad (34)$$

where $\Delta\hat{q}_1$ and $\Delta q_1$ are the projections of
$\Delta\hat{q}$ and $\Delta q$ onto the range of N. A modification of the
computational algorithm, called the incremental difference calculation, was
described which minimized the error amplification due to the experimental
design. In the incremental difference calculation the $\Delta Q$ matrix is
composed of the change in volume corrected response between two successive
additions and the $\Delta N$ matrix is composed of the absolute quantity of
analyte added in a single addition. After scaling, the condition number of
the $\Delta N$ matrix is equal to one, which results in no error
amplification being introduced in the final concentration estimates due to
the experimental design.
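
The effect of the design on the condition number can be checked with a short
numerical sketch. The addition scheme below (four unit additions alternating
between two analytes) is a hypothetical example, and the helper cond is
written out from singular values so that it also applies to rectangular
matrices.

```python
import numpy as np

# Hypothetical design: four additions alternating between two analytes
single_additions = np.array([[1.0, 0.0],
                             [0.0, 1.0],
                             [1.0, 0.0],
                             [0.0, 1.0]])

dN_incremental = single_additions                 # quantity added in each single addition
dN_total = np.cumsum(single_additions, axis=0)    # total quantity added since the start

def cond(A):
    """Condition number as the ratio of largest to smallest singular value."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

cond_incremental = cond(dN_incremental)
cond_total = cond(dN_total)

# Equation 33: the same value obtained from the square matrix A'A
cond_total_eq33 = np.sqrt(cond(dN_total.T @ dN_total))
```

For this design the incremental matrix has orthogonal columns of equal length,
so its condition number is one, while the total difference matrix has a
condition number well above one.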

The condition number of the K matrix can also lead to a magnification of
the potential concentration errors. The authors showed that, in the worst
case, a small relative error in the initial response vector, $r_0$, could be
magnified by cond(K) to produce a larger relative concentration error.

The error in the concentration estimates was found to be

where $\delta c_0$, $\delta r_0$, and $\delta K$ are the errors present in
$c_0$, $r_0$, and K, respectively. Recently, Kalivas (26) showed that the
condition number of the K matrix is an extremely useful tool for assessing the
analytical cost, in terms of relative uncertainty, of varying sensor
selectivity. Minimization of the condition number of the K matrix can be used
as a criterion for the selection of the optimal set of sensors for a
particular multicomponent analysis.
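
As an illustration of using the condition number as a selection criterion, the
sketch below searches all sensor subsets of a given size and keeps the one
with the smallest cond(K). The sensitivity values and the function name
best_sensor_subset are hypothetical, and an exhaustive search is assumed to be
affordable, which holds only for a small number of candidate sensors.

```python
import numpy as np
from itertools import combinations

# Hypothetical sensitivities: rows are 2 analytes, columns are 5 candidate sensors
K = np.array([[1.00, 0.90, 0.50, 0.20, 0.05],
              [0.95, 0.80, 0.50, 0.60, 0.90]])

def best_sensor_subset(K, n_sensors):
    """Return the sensor subset with the smallest condition number, and that number."""
    scored = [(np.linalg.cond(K[:, list(subset)]), subset)
              for subset in combinations(range(K.shape[1]), n_sensors)]
    best_cond, best_subset = min(scored)
    return best_subset, best_cond

subset, cond_best = best_sensor_subset(K, 2)

# Two nearly collinear sensors, for comparison
cond_first_two = np.linalg.cond(K[:, [0, 1]])
```

Sensors 0 and 1 respond almost identically to both analytes, so pairing them
gives a far larger condition number than the selected pair.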
Moran and Kowalski (27) have considered the statistical aspects of the
GSAM. They have found that the uncertainty in the estimates of the
sensitivity coefficients, i.e. the $k_{i,j}$'s, is dependent on three terms:
the measurement variance, correlation of the response measurements due to
subtraction of the initial response, and variance arising from the volume
increase as a result of making standard additions. In order to reduce the
variance and obtain the best possible accuracy in the concentration
estimates, they recommend several steps. First, the volume increases must be
minimized. Second, if random noise is the dominant source of error, then the
total difference calculation method should be used. Third, the largest
possible additions of analyte should be made.

Quantitation

Presuming the sensitivity matrix, K, has been obtained, the quantitation
step can be approached by an extension of the single component model.
Sternberg, Stillo, and Schwendeman (28) have described the application of the
least squares method in matrix form to the spectrophotometric analysis of a
five component mixture. They noted that certain restrictions are necessary to
assure that a solution to the matrix problem will exist. The length of the
response vector, r, and the column dimension of the sensitivity matrix, K,
must be equal to or greater than the number of analytes; therefore P must be
greater than or equal to N. In addition, the rank of the sensitivity matrix
must be at least N, which implies the P sensors must span a minimum of an
N-dimensional space. If there are exactly the same number of sensors as there
are analytes present, i.e. N = P, then the solution to the matrix problem is
simply given as

$$c' = r' K^{-1} \qquad (36)$$

However, if more sensors than the minimum number necessary to obtain a
solution to the system of linear equations are used, i.e. P > N, then the
method of least squares can be used to obtain the set of estimated analyte
concentrations which minimizes the difference between the measured responses
and the responses predicted by the multicomponent linear model. The solution
to this least squares problem in matrix form was given in equation 12. Two
years later, Zscheile and co-workers (29) used the matrix form of the least
squares method to examine a four component spectrophotometric system. In
analyzing a system of RNA constituents, they observed that the stability of
the concentration estimates was very dependent on the wavelengths selected for
the analysis. The poor stability obtained with some sets of wavelengths was
attributed to linear dependence of the underlying pure analyte spectra. The
best results were obtained when all the available wavelengths were used.
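
Both quantitation cases can be sketched in a few lines. The sensitivity
matrices and concentrations are hypothetical, and the overdetermined case
assumes the usual normal-equation form c' = r'K'(KK')^{-1} for the least
squares solution referred to as equation 12.

```python
import numpy as np

c_true = np.array([0.6, 1.1])              # hypothetical analyte concentrations

# Case N = P: equation 36, solved without forming the inverse explicitly
K_square = np.array([[2.0, 0.5],
                     [0.3, 1.5]])
r_square = c_true @ K_square
c_exact = np.linalg.solve(K_square.T, r_square)   # same as r' K^{-1}

# Case P > N: least squares over four sensors with a little measurement noise
K_over = np.array([[2.0, 0.5, 0.1, 0.7],
                   [0.3, 1.5, 0.8, 0.4]])
rng = np.random.default_rng(0)
r_over = c_true @ K_over + rng.normal(scale=1e-3, size=4)

c_ls, *_ = np.linalg.lstsq(K_over.T, r_over, rcond=None)

# The same estimate written out with the normal equations
c_normal = r_over @ K_over.T @ np.linalg.inv(K_over @ K_over.T)
```

The lstsq call is generally preferable numerically to the explicit normal
equations, which square the condition number of K.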

The same considerations regarding homogeneity of variance, which were
necessary for the single component linear model, must also be made when the
multicomponent model is used. Haaland and Easterling (30) applied a linear
additive multicomponent model to the analysis of infrared spectra of xylene
isomer mixtures. They observed that the noise characteristics of most
infrared detectors were such that the noise was generally constant and
independent of the signal level. The signal measured by these detectors is in
transmittance, which is then converted to absorbance. Since Beer's law is
generally obeyed in this spectral region, the absorbance is directly
proportional to concentration; however, the precision of the absorbance
measurements is not independent of the measured responses. To account for
this non-homogeneity, Haaland and Easterling used a weighted least squares
procedure. Expanding the absorbance signal as a Taylor series about the
transmittance and retaining only the first two terms, they found the variance
of the noise was proportional to the inverse of the square of the
transmittance. Therefore, they performed the analysis by first weighting each
measured response in the spectrum by a factor equal to its transmittance
squared. The matrix form of the weighted least squares estimate of the
analyte concentrations is given by

$$c' = r' V^{-1} K' (K V^{-1} K')^{-1} \qquad (37)$$

where the matrix V is a diagonal matrix containing the reciprocal of the
weights. This method of weighting assumes the errors in the responses are
independent but with different variances. If the response measurements are
correlated, equation 37 may still be used; however, the matrix V is no longer
diagonal.
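
A minimal numerical sketch of transmittance-squared weighting follows. The
absorptivities, concentrations, and noise level are hypothetical; the function
wls implements the weighted estimate with V taken as the diagonal matrix of
reciprocal weights, so that its inverse is simply the diagonal matrix of
weights.

```python
import numpy as np

K = np.array([[0.80, 0.30, 0.10, 0.50],
              [0.10, 0.60, 0.90, 0.40]])    # hypothetical absorptivities, N = 2, P = 4
c_true = np.array([0.5, 0.7])

A = c_true @ K                               # noise-free absorbances (Beer's law)
T = 10.0 ** (-A)                             # corresponding transmittances

rng = np.random.default_rng(1)
T_noisy = T + rng.normal(scale=1e-3, size=T.size)   # constant noise in transmittance
A_noisy = -np.log10(T_noisy)                 # converted to absorbance

weights = T_noisy ** 2                       # weight = transmittance squared
V_inv = np.diag(weights)                     # inverse of V = diag(1 / weights)

def wls(r, K, V_inv):
    """Weighted least squares: c' = r' V^{-1} K' (K V^{-1} K')^{-1}."""
    return r @ V_inv @ K.T @ np.linalg.inv(K @ V_inv @ K.T)

c_wls = wls(A_noisy, K, V_inv)

# Equivalent route: scale each sensor by sqrt(weight) = T, then ordinary least squares
c_scaled, *_ = np.linalg.lstsq((K * T_noisy).T, A_noisy * T_noisy, rcond=None)
```

The second route shows that the weighting amounts to multiplying each
absorbance and the corresponding column of K by the transmittance before an
ordinary least squares fit.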


Deterministic  Errors 


As was observed with the single component model, various types of
deterministic errors may affect the multicomponent linear model. These errors,
which may be due to chemical factors, e.g. matrix effects or interferences, or
instrumental factors, e.g. drifting or non-zeroed sensors, result in violating
the assumptions present in the linear additive response model. In two recent
papers (31,32), Kalivas and Kowalski have extended the GSAM model to add one
or more terms to the basic model which allow for the detection and correction
of sensor drift occurring during the course of the analysis. The GSAM model
with the inclusion of terms for so-called time additions is

where W is the polynomial order of the drift model. Volume correction was
performed as earlier described. The (N+1)-st to (N+W)-th rows of the K matrix
represent the coefficients of the drift model. The drift coefficients can be
examined statistically in order to detect the presence of a drifting
analytical sensor. Estimation of the initial analyte quantities is
accomplished as before, after deletion from the K matrix of the rows
containing the sensitivity coefficients for the time additions.
Implementation of the drift correcting GSAM model requires augmenting the
$\Delta N$ matrix with W rows containing the time elapsed since the initial
response measurement raised to the appropriate power. This was accomplished by
developing a completely automated system for making the standard additions,
measuring the responses, and recording the elapsed time (32). In addition to
implementing the time additions and drift correction, this system was designed
to make the standard additions by weight instead of by volume in order to
minimize the relative errors in measuring the amount of analyte added.
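
The idea of time additions can be sketched as an augmented least squares
problem. The sketch assumes a quadratic drift (W = 2), noise-free simulated
data, and an orientation in which the powers of elapsed time are appended as
extra columns of the addition matrix so that they pair with the extra rows of
K; all numerical values are hypothetical.

```python
import numpy as np

K_true = np.array([[2.0, 0.5],
                   [0.3, 1.5]])              # N = 2 analytes, P = 2 sensors
drift_true = np.array([[0.020, -0.010],      # linear drift coefficient per sensor
                       [0.001,  0.000]])     # quadratic drift coefficient per sensor

added = np.array([[0.2, 0.0], [0.0, 0.2], [0.2, 0.0],
                  [0.0, 0.2], [0.2, 0.0], [0.0, 0.2]])
dN = np.cumsum(added, axis=0)                # total quantities added (M x N)

# Elapsed time at each measurement; irregular spacing keeps the time columns
# from being collinear with the addition columns
t = np.array([1.0, 2.5, 3.0, 5.0, 6.5, 7.0])
time_terms = np.column_stack([t, t ** 2])

# Simulated volume corrected response changes including sensor drift
dQ = dN @ K_true + time_terms @ drift_true

# Augmented design: analyte additions plus "time additions"
design = np.hstack([dN, time_terms])
K_aug, *_ = np.linalg.lstsq(design, dQ, rcond=None)
K_hat, drift_hat = K_aug[:2], K_aug[2:]      # sensitivity rows, then drift rows

# Ignoring the drift biases the sensitivity estimates
K_naive, *_ = np.linalg.lstsq(dN, dQ, rcond=None)
```

Equally spaced additions of equal size would make the linear time column a
multiple of the summed addition columns, which is one reason the recorded
elapsed times matter in practice.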
Implicit in the multicomponent linear model is an assumption that the
response of the analytical sensors can be zeroed. Two types of model failure
have been identified in connection with this assumption: first, an
instrumental or constant background; and second, a sample or volume dependent
background. Altering the multicomponent model to compensate for an
instrumental background can be accomplished by adding a constant term for each
sensor,

$$r' = c' K + d' \qquad (39)$$

where the vector d contains the background contribution at each of the P
analytical sensors. Vandeginste et al. (33) have shown that a dilution
procedure can be used to correct for a constant background response. Equation
39 is rewritten for volume corrected responses as

$$V_0 r_0' = V_0 (c_0' K + d') = n_0' K + V_0 d' \qquad (40)$$

where $V_0$ is the initial volume of the sample mixture, $r_0$ is the initial
response vector, $c_0$ is the initial concentration vector, and $n_0$ is the
vector of initial analyte quantities. A standard dilution is performed by
adding a volume, $\Delta v$, of pure solvent to the mixture sample. Equation
39 can again be rewritten in terms of the volume corrected responses; however,
now the total sample volume is $V_0 + \Delta v$. Since the absolute quantities
of analyte present have not been affected by the dilution, the difference in
the volume corrected responses, $\Delta q$, is simply

$$\Delta q' = q' - q_0' = \Delta v \, d' \qquad (42)$$

This relationship allows estimation of the constant background vector, d,
since it is a function of only the added volume, $\Delta v$, and the vector of
measured changes in the volume corrected responses, $\Delta q$.
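
The standard dilution calculation is short enough to verify numerically. The
sensitivities, background, quantities, and volumes below are hypothetical, the
responses are simulated without noise, and K is assumed to be already known
for the final quantitation step.

```python
import numpy as np

K = np.array([[2.0, 0.5, 0.1],
              [0.3, 1.5, 0.8]])              # N = 2 analytes, P = 3 sensors
d_true = np.array([0.05, 0.02, 0.08])        # constant instrumental background
n0 = np.array([0.40, 0.25])                  # analyte quantities in the sample
V0, dv = 10.0, 5.0                           # initial volume and added solvent volume

def response(n, V):
    """Equation 39 with concentrations c = n / V."""
    return (n / V) @ K + d_true

q0 = V0 * response(n0, V0)                   # volume corrected, before dilution
q1 = (V0 + dv) * response(n0, V0 + dv)       # volume corrected, after dilution

d_hat = (q1 - q0) / dv                       # equation 42 solved for d

# Background-corrected quantitation of the initial concentrations
c0_hat = (response(n0, V0) - d_hat) @ np.linalg.pinv(K)
```

The analyte term n'K cancels in the difference because dilution does not
change the absolute quantities present.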


The presence of additional components in the sample mixture gives rise
to a sample or volume dependent background. This can be incorporated into
the standard multicomponent linear model by adding additional terms which
express the response as a function of the known analytes and the additional
interferents. The expanded model is

$$r_l = \sum_{i=1}^{N} c_i k_{i,l} + \sum_{j=1}^{T} c_j k_{j,l}
\quad \text{for all } l = 1, \ldots, P \qquad (43)$$

where $r_l$ is the response of the l-th sensor. The first summation, which
runs from one to N, accounts for the response caused by the N known analytes.
The second summation, which runs from one to T, accounts for the response
caused by the presence of the T interfering components. Since the identities
of the T interfering components are not known, no standards for these
components can be used nor can standard additions of these components be made.
Hence, during the course of the analysis, the relative amounts of the
interferents with respect to each other will not change. Therefore, these T
interferents can be replaced by a single term which represents their combined
influence on the measured sensor responses,

$$r_l = \sum_{i=1}^{N} c_i k_{i,l} + f_l
\quad \text{for all } l = 1, \ldots, P \qquad (44)$$

where $f_l$ is the combined background response at sensor l. The important
distinction between this model and the model given in equation 39, which
describes an instrumental background, is that the term $f_l$ is a function of
the sample volume; therefore the standard dilution method used by Vandeginste
does not apply. Since the sample background, $f_l$, is not known, an iterative
method must be used to perform mixture quantitation.


In their original paper describing the GSAM model, Saxberg and Kowalski
(24) discussed the problem of analytical sensors which were not zeroed. They
observed that if the background response was small relative to the initial
unknown response and the problem was reasonably insensitive to perturbations,
then the effect on the final solution can be expected to be small.

Leggett (34) has applied non-negative least squares regression and simplex
optimization to multicomponent spectrophotometric data. He concluded that
either of these methods avoids the problem of negative molar absorptivities or
concentrations which are sometimes obtained when ordinary least squares
regression has been used. This conclusion was reached with the stated
assumption that the correct model, e.g. all components were known, had been
used to set up the analysis. Gayle and Bennett (35) carried out simulation
studies to determine the effect of model departure in multicomponent analysis
on the concentration estimates obtained by ordinary least squares regression,
non-negative least squares regression, and linear programming. They observed
that when various types of model failure were simulated, all three methods
yielded biased results, with no single method being consistently superior to
the other two. In addition, of the three methods attempted only ordinary
least squares provided any indication that the model was not valid. Omission
of significant terms in this model frequently led to negative analyte
concentrations, a result which obviously had no chemical meaning. However,
non-negative least squares regression and linear programming yielded results
which at least on the surface seemed chemically plausible, but were also
significantly in error.
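
The way ordinary least squares signals an omitted component can be seen in a
small constructed example. The spectra and concentrations are hypothetical and
chosen only so that the arithmetic is easy to follow; the non-negative and
linear programming alternatives are not reproduced here.

```python
import numpy as np

# Two known analytes measured at three sensors
K_known = np.array([[1.0, 1.0, 0.0],
                    [0.0, 1.0, 1.0]])
c_true = np.array([1.0, 0.1])

# An interferent that is present in the sample but omitted from the model
k_missing = np.array([1.0, 0.0, 0.0])
c_missing = 1.0

r = c_true @ K_known + c_missing * k_missing      # measured response [2.0, 1.1, 0.1]

# Ordinary least squares using only the two known analytes
c_ols, *_ = np.linalg.lstsq(K_known.T, r, rcond=None)
residual = r - c_ols @ K_known
```

Here the estimate is approximately (1.667, -0.233): the first analyte is
overestimated and the second comes out negative, which flags the model as
invalid. A method constrained to non-negative values would hide that warning
while remaining biased.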

The final type of model failure which may occur is a failure of the
assumed linear relationship between analyte concentration and the measured
response. Apparent deviations from the ideal behavior described by the
Beer-Lambert law, which is widely used in spectrophotometric analysis, are
well known. Saxberg and Kowalski (24) developed the original GSAM model to
allow the response to be either a linear, quadratic, or higher degree
polynomial function of the analyte concentration. Unfortunately, as the
number of terms in the model increases, so does the required number of
standard additions and measurements which the analyst must make. An
alternative approach has been used by Vandeginste and co-workers (33)
involving the application of a mathematical technique known as Kalman
filtering to provide continuous testing of the validity of the linear model
during the data acquisition stage of a GSAM experiment. Poulisse (36) has also
applied the Kalman filter to the analysis of multicomponent spectrophotometric
mixtures. Seelig and Blount (37,38) have applied this method to anodic
stripping voltammetry and S. Brown and co-workers (39,40) have used the Kalman
filter with linear sweep voltammetry and photoacoustic spectroscopy. This
filter relies on a recursive algorithm which constantly updates the estimated
sensitivities as more standard mixtures are analyzed. The recursive nature of
this filter, which has only recently seen application in analytical chemistry,
has the potential of providing feedback for on-line evaluation and
optimization of the calibration process.
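
The recursive character of the filter can be sketched for the simplest case,
in which the sensitivities of one sensor are treated as constant parameters.
Under that assumption the Kalman filter reduces to recursive least squares;
the sketch is not the specific formulation of any of the papers cited above,
and the standards, prior covariance, and measurement variance are hypothetical.

```python
import numpy as np

def kalman_update(k, P, c, r, meas_var):
    """Update sensitivity estimate k and its covariance P with one standard."""
    innovation = r - c @ k                 # measured minus predicted response
    s = c @ P @ c + meas_var               # variance of the innovation
    gain = P @ c / s                       # Kalman gain
    k_new = k + gain * innovation
    P_new = P - np.outer(gain, c @ P)
    return k_new, P_new, innovation

k_true = np.array([2.0, 0.3])              # sensitivities of one sensor to two analytes
standards = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [0.5, 0.5],
                      [1.0, 0.3],
                      [0.2, 0.9]])         # concentrations in each standard mixture
responses = standards @ k_true             # noise-free simulated responses

k = np.zeros(2)                            # vague starting estimate
P = 1e6 * np.eye(2)                        # large initial uncertainty
innovations = []
for c, r in zip(standards, responses):
    k, P, nu = kalman_update(k, P, c, r, meas_var=1e-6)
    innovations.append(nu)
```

Once the estimate has converged, an innovation that is large compared with its
expected variance s indicates that the linear model no longer describes the
data, which is the basis for testing model validity during acquisition.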

Multicomponent  Bilinear  Models 

The multicomponent bilinear model is obtained when a second measurement
dimension is incorporated into the multicomponent linear model. This model
describes the response of a single mixture sample along two independent
measurement axes. Important applications of the bilinear model in analytical
chemistry include instrumental techniques based on two spectroscopic
measurements, e.g. fluorescence emission-excitation matrices or EEMs, and
techniques based on a combination of chromatographic and spectroscopic
measurements, e.g. LC/UV, GC/MS, or GC/FTIR analyses. The response of a
single component can be described as

$$M = \alpha \, x y' \qquad (45)$$

where M contains the measured responses and is the outer product of the
vectors x and y multiplied by a concentration dependent factor, $\alpha$. The
vector x represents the spectral, chromatographic, or temporal profile in the
first dimension and y represents the spectral, chromatographic, or temporal
profile of the compound in the second dimension. For example, a GC/MS peak
consisting of 50 mass spectra each composed of 20 distinct m/e ratios would
result in a matrix of spectral intensities, M, containing 50 rows and 20
columns. The vector x would have 50 elements and describe the gas
chromatographic elution profile. The vector y would have 20 elements and
represent the mass spectrum of the pure compound. Normally, x and y are
normalized to a length of one, so that the factor $\alpha$ is then
proportional to the standard concentration of the pure compound. Assuming
that each component in an N component mixture responds independently of the
remaining N-1 components, the response of the mixture can be represented by

$$M = \sum_{i=1}^{N} c_i M_i = \sum_{i=1}^{N} c_i (\alpha \, x y')_i \qquad (46)$$

where $M_i$ is the standard response matrix due to component i in the mixture
and $c_i$ is the concentration of the i-th component divided by its standard
concentration.
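
The GC/MS example can be built numerically to make the rank structure of the
bilinear model concrete. The Gaussian elution profiles, random mass spectra,
scale factors, and relative concentrations below are all hypothetical.

```python
import numpy as np

def unit(v):
    """Normalize a profile to a length of one."""
    return v / np.linalg.norm(v)

def gaussian(n_points, center, width):
    """Simple Gaussian peak used as a stand-in for an elution profile."""
    idx = np.arange(n_points)
    return np.exp(-0.5 * ((idx - center) / width) ** 2)

rng = np.random.default_rng(0)

# Two overlapping elution profiles (50 scans) and two mass spectra (20 m/e values)
x1, x2 = unit(gaussian(50, 20, 4)), unit(gaussian(50, 28, 4))
y1, y2 = unit(rng.random(20)), unit(rng.random(20))

# Standard response matrices of the pure components (equation 45)
M1 = 100.0 * np.outer(x1, y1)
M2 = 80.0 * np.outer(x2, y2)

# Mixture response (equation 46) with relative concentrations c
c = np.array([0.7, 1.3])
M_mix = c[0] * M1 + c[1] * M2

rank_single = np.linalg.matrix_rank(M1)
rank_mixture = np.linalg.matrix_rank(M_mix)
```

Each pure component contributes a matrix of rank one, so in the absence of
noise the rank of the mixture matrix equals the number of components.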


Least  Squares  Multiple  Regression 

The analysis of the data obtained from an experimental system, which is
described by the multicomponent bilinear model, depends on the information
available to the analyst. A common problem is the quantitation of several
components whose identities are known and whose standard matrices are
available. In this situation, least squares regression may be used. The
objective is to minimize the sum of the squared elements of the residual
matrix, E, which is defined as

$$E_{ij} = M_{ij} - \sum_k \beta_k (M_{ij})_k \qquad (47)$$

where the parameters $\beta_k$ are the amounts of each of the k compounds
present in the mixture. Warner et al. have applied this method to the analysis
of fluorescence emission-excitation matrices (41). The least squares approach,
while easy to implement and conceptually simple, yields accurate results
only if standards of all of the mixture components are included in the data
analysis.
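
One way to carry out this regression is to unfold each standard matrix into a
column and solve an ordinary least squares problem, as sketched below. The
standard matrices are random rank-one matrices used purely for illustration,
and the symbol beta follows equation 47.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical rank-one standard matrices for three known components (30 x 15)
standards = [np.outer(rng.random(30), rng.random(15)) for _ in range(3)]
beta_true = np.array([0.5, 1.2, 0.8])

M = sum(b * S for b, S in zip(beta_true, standards))
M_noisy = M + rng.normal(scale=1e-4, size=M.shape)

# Unfold every standard into one column; each matrix element becomes one equation
design = np.column_stack([S.ravel() for S in standards])
beta_hat, *_ = np.linalg.lstsq(design, M_noisy.ravel(), rcond=None)

# Residual matrix E of equation 47 at the fitted amounts
E = M_noisy - sum(b * S for b, S in zip(beta_hat, standards))

# Leaving the third standard out of the analysis biases the other two estimates
beta_incomplete, *_ = np.linalg.lstsq(design[:, :2], M_noisy.ravel(), rcond=None)
```

The last fit illustrates the limitation stated above: with one component
missing from the set of standards, the remaining amounts absorb its
contribution and are no longer accurate.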

Rank Annihilation

In many situations, the identities of all components contributing to
the measured response may not be known. Ho and coworkers have developed the
method of rank annihilation to allow the quantitation of one or several
components without requiring knowledge of all of the components in a mixture
sample (42,43). They have applied this method to the quantitative analysis
of multicomponent emission-excitation matrices (EEMs) obtained from the
analysis of polynuclear aromatic hydrocarbon mixtures with the video
fluorometer. Ideally the mixture matrix, M, should have a rank equal to the
number of components, N, in the mixture. For a mixture of N components, the
best least squares approximation of M is given by

$$\hat{M} = \sum_{k=1}^{N} \xi_k u_k v_k' \qquad (48)$$

where

$$M v_k = \xi_k u_k \qquad (49)$$

and

$$M' u_k = \xi_k v_k \qquad (50)$$

The eigenvectors $\{u_1, \ldots, u_N\}$ and $\{v_1, \ldots, v_N\}$ should span
the same vector spaces as the pure component vectors $\{x_1, \ldots, x_N\}$
and $\{y_1, \ldots, y_N\}$. The number of nonzero eigenvalues, $\xi_k$, equals
the number of components in the sample. In order to perform rank annihilation
an amount, $\beta$, of the standard matrix, $M_j$, which corresponds to a
component known to be present in the mixture, is subtracted from the mixture
matrix, M, to yield

$$E = M - \beta M_j \qquad (51)$$

When the correct value of $\beta$, corresponding to the concentration of $M_j$
in M, has been subtracted the rank of EE' will be N-1. This is indicated by
one of the nonzero eigenvalues in EE' approaching zero. Since real data
contain experimental error, the eigenvalue does not become exactly zero, but
it does have a distinct minimum. The advantages of this technique are that
it does not require the knowledge of all of the sample constituents or the
presence of selective spectral regions.
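
The search for the eigenvalue minimum can be sketched with a simple scan over
trial values of beta. The example assumes a two-component mixture in which
only the first component has a standard; the profiles, scale factors, noise
level, and the grid of trial values are hypothetical, and a grid scan is used
here in place of any particular optimization routine.

```python
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

rng = np.random.default_rng(3)
x = [unit(rng.random(25)) for _ in range(2)]
y = [unit(rng.random(18)) for _ in range(2)]

M1 = 50.0 * np.outer(x[0], y[0])          # standard of the known analyte
M2 = 40.0 * np.outer(x[1], y[1])          # unknown interferent, no standard available

beta_true = 0.7
M = beta_true * M1 + 1.5 * M2 + rng.normal(scale=1e-3, size=M1.shape)

def nth_eigenvalue(E, n):
    """Return the n-th largest eigenvalue of EE' (n = 1 is the largest)."""
    eigenvalues = np.linalg.eigvalsh(E @ E.T)   # returned in ascending order
    return eigenvalues[-n]

# With N = 2 components, the second eigenvalue should dip when beta is correct
betas = np.linspace(0.0, 1.5, 151)
track = np.array([nth_eigenvalue(M - b * M1, 2) for b in betas])
beta_hat = betas[np.argmin(track)]
```

The interferent is never modeled explicitly; it only has to contribute a
rank-one term that is linearly independent of the analyte standard.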

If quantitation of several known species in a multicomponent bilinear
mixture is desired, then an extension of rank annihilation based on the
Fletcher-Powell algorithm may be used (44). This algorithm allows
simultaneous computation of the concentrations of all known components in the
mixture sample. McCue and Malinowski have used rank annihilation of UV
absorbance spectra to quantify coeluting liquid chromatographic peaks (45).
Applying rank annihilation to LC/UV data requires that the elution profiles
of each individual component in the mixture are exactly reproducible between
the chromatographed standard samples and the mixture.

Self  Modeling  Curve  Resolution 

In 1971, Lawton and Sylvester (46) reported a method, which they termed
self modeling curve resolution, for resolving two unknown overlapping
functions from an observed set of mixtures of the two functions. They noted
that this type of problem arises frequently in areas such as chromatography
and spectrophotometry. The method assumes that neither the identities of
the individual components nor their responses are known, but that the
responses for a number of mixtures of varying relative amounts of the same
underlying components have been measured. The objectives of self modeling
curve resolution are two-fold: first, to estimate the spectra of the
underlying pure components; and second, to quantify the amount of each pure
component present in a given mixture. The model developed by Lawton and
Sylvester can be described as follows. The measured response of a mixture
of two pure components can be expressed as the sum of the responses of the
individual components. This is simply the two-component case of the
multicomponent linear model developed in the last section and can be written

$m = x_1 y_1 + x_2 y_2$    (52)

where $m$ represents a single mixture spectrum, $x_1$ and $x_2$ are the
concentrations of the two pure components, and the vectors $y_1$ and $y_2$
are the spectra of the pure components. Normalization of the pure component
spectra to unit area does not restrict the shape of the unknown spectra.
The concentrations $x_1$ and $x_2$ are now defined relative to the
concentration of analyte which produces an absorbance spectrum of unit
area. If $N$ different mixture samples of these two components are
measured, then the entire data set can be expressed in matrix form as

$M = X Y$    (53)

where $M$ is an $N \times P$ matrix of measured responses, $X$ is an
$N \times 2$ matrix of analyte concentrations, and $Y$ is a $2 \times P$
matrix of analyte spectra scaled to unit area. Since only two components
are present, each observed mixture spectrum, i.e. each row of $M$, can be
expressed as a linear combination of the first two eigenvectors of the
second moment matrix, $M'M/N$. That is


$m_i = \epsilon_{i1} V_1 + \epsilon_{i2} V_2$    (54)

where $m_i$ is the i-th mixture spectrum and $V_1$ and $V_2$ are the
eigenvectors associated with the two largest eigenvalues of $M'M/N$. The
spectra, $y_1$ and $y_2$, of the two pure components must also be linear
combinations of these two eigenvectors.

$y_i = \eta_{i1} V_1 + \eta_{i2} V_2$ for $i = 1, 2$    (55)

Determination of the values of $\eta_{i1}$ and $\eta_{i2}$ is equivalent to
estimation of the unknown pure spectra.
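
The eigenvector representation can be checked numerically. The following
sketch builds hypothetical noise-free mixtures of two overlapping Gaussian
bands (the band shapes, the concentrations, and all variable names are
assumptions made for illustration) and confirms that both the mixture
spectra and the pure spectra lie in the plane of the first two eigenvectors
of the second moment matrix. In a real analysis the pure spectra are
unknown; they are used here only to verify the algebra.

```python
import numpy as np

# Hypothetical pure spectra: two overlapping Gaussian bands scaled to unit area
grid = np.linspace(0.0, 1.0, 50)
Y = np.vstack([np.exp(-((grid - 0.4) / 0.1) ** 2),
               np.exp(-((grid - 0.6) / 0.1) ** 2)])
Y /= Y.sum(axis=1, keepdims=True)

# Hypothetical concentrations for N = 4 mixtures
X = np.array([[0.8, 0.2], [0.6, 0.4], [0.5, 0.5], [0.3, 0.7]])
M = X @ Y                                   # data matrix of mixture spectra
N = M.shape[0]

# Eigenvectors of the second moment matrix M'M/N, largest eigenvalues first
eigvals, eigvecs = np.linalg.eigh(M.T @ M / N)
V = eigvecs[:, ::-1][:, :2].T               # rows are V_1 and V_2

scores = M @ V.T                            # coefficients of each mixture on V_1, V_2
eta = Y @ V.T                               # coefficients of each pure spectrum

spans_mixtures = np.allclose(scores @ V, M)       # mixtures rebuilt from V_1, V_2
spans_pure = np.allclose(eta @ V, Y)              # pure spectra rebuilt as well
mixing_holds = np.allclose(scores, X @ eta)       # mixture coefficients = X times eta
print(spans_mixtures, spans_pure, mixing_holds)
```

The last check states that the eigenvector coordinates of each mixture are
the concentration-weighted sum of the coordinates of the two pure spectra.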


Lawton and Sylvester applied three restrictions in order to obtain
physically meaningful estimates of the pure spectra, $y_1$ and $y_2$. The
first restriction was that all elements of the unknown pure spectra must be
non-negative. This implies that $\eta_{i1}$ and $\eta_{i2}$ must satisfy

$\eta_{i1} v_{1k} + \eta_{i2} v_{2k} \geq 0$ for all $k = 1, \ldots, P$    (56)

where $v_{jk}$ is the k-th element of eigenvector $V_j$. This is equivalent
in a chemical sense to not allowing negative absorbances. The second
restriction was that all of the mixture spectra must be composed of
non-negative amounts of the two pure components. This requires
$x_{ij} \geq 0$ for all $i$ and $j$. From equations 52, 54, and 55, it can
be shown that this restriction is equivalent to requiring $x_{i1} \geq 0$
and $x_{i2} \geq 0$ in

$(\epsilon_{i1}, \epsilon_{i2}) = x_{i1} (\eta_{11}, \eta_{12}) + x_{i2} (\eta_{21}, \eta_{22})$ for $i = 1, \ldots, N$    (57)


The final restriction was based on the assumption that the unknown spectra,
$y_i$, have been normalized to unit area. Figure 1 illustrates these three
restrictions plotted in the 2-dimensional eigenvector space $\{V_1, V_2\}$.
The angle formed by the inner constraint in figure 1 represents the range
of relative analyte concentrations within the set of mixture samples. The
angle formed by the outer constraint is related to the spectral uniqueness
of the two pure components. Without requiring any assumptions as to the
shape of the spectral curves, two regions, $F_I$ and $F_{II}$, which contain
the eigenvector representation of the pure spectra, $y_1$ and $y_2$, were
obtained.
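
As an illustration of how the three restrictions bound the solution, the
sketch below tests candidate pairs of pure spectra, expressed as
coordinates in the eigenvector space, against each restriction. The helper
name satisfies_restrictions, the numerical tolerance, and the synthetic
Gaussian data are hypothetical and are not taken from Lawton and Sylvester.
A candidate placed beyond the outer constraint fails the non-negative
response test, while one placed inside the inner constraint forces a
negative amount in at least one mixture.

```python
import numpy as np

def satisfies_restrictions(eta, V, scores, tol=1e-9):
    """Check a candidate pair of pure spectra against the three restrictions.

    eta    : 2 x 2 array; row i holds the coordinates of candidate spectrum i
    V      : 2 x P array whose rows are the eigenvectors V_1 and V_2
    scores : N x 2 array; row i holds the coordinates of mixture i
    """
    spectra = eta @ V
    nonneg_response = np.all(spectra >= -tol)            # first restriction
    unit_area = np.allclose(spectra.sum(axis=1), 1.0)    # third restriction
    amounts = scores @ np.linalg.inv(eta)                # amounts implied by the candidates
    nonneg_amounts = np.all(amounts >= -tol)             # second restriction
    return bool(nonneg_response and unit_area and nonneg_amounts)

# Hypothetical noise-free data: two overlapping unit-area Gaussian bands
grid = np.linspace(0.0, 1.0, 50)
Y = np.vstack([np.exp(-((grid - 0.4) / 0.1) ** 2),
               np.exp(-((grid - 0.6) / 0.1) ** 2)])
Y /= Y.sum(axis=1, keepdims=True)
X = np.array([[0.8, 0.2], [0.6, 0.4], [0.5, 0.5], [0.3, 0.7]])
M = X @ Y

_, eigvecs = np.linalg.eigh(M.T @ M / M.shape[0])
V = eigvecs[:, ::-1][:, :2].T
scores = M @ V.T

eta_true = Y @ V.T                                       # true pure spectra
too_far_out = np.vstack([1.5 * eta_true[0] - 0.5 * eta_true[1], eta_true[1]])
too_far_in = np.vstack([0.5 * (eta_true[0] + eta_true[1]), eta_true[1]])

ok_true = satisfies_restrictions(eta_true, V, scores)    # True
ok_out = satisfies_restrictions(too_far_out, V, scores)  # False: negative response
ok_in = satisfies_restrictions(too_far_in, V, scores)    # False: negative amount
print(ok_true, ok_out, ok_in)
```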


Sharaf and Kowalski (47,48) have considered the problem of quantitation in
the two-dimensional eigenvector space. They have shown that quantitative
resolution of the two components in any given mixture spectrum is a
straightforward function of the relative positions of the two pure spectra
and the mixture spectrum in the eigenvector space. Assuming the mixture
spectrum has been normalized to unit area, it will fall somewhere along the
line segment separating regions $F_I$ and $F_{II}$ in figure 1. In order to
quantify a mixture spectrum, the positions of the pure spectra, $y_1$ and
$y_2$, within the regions $F_I$ and $F_{II}$, respectively, must be known or
estimated. If point m in region $F_I$ is selected to represent the pure
spectrum of component 1 and point n in region $F_{II}$ is selected to
represent the pure spectrum of component 2, then Sharaf and Kowalski proved
that the fraction of the total response of mixture i due to component 1,
$F_{i1}$, is given by

$F_{i1} = d_{ni} / d_{mn}$    (58)

where $d_{ni}$ is the Euclidean distance from point n to mixture i and
$d_{mn}$ is the distance from point m in region $F_I$ to point n in region
$F_{II}$. The analogous expression for the fraction of the total response
of mixture i due to component 2, $F_{i2}$, is given as

$F_{i2} = d_{mi} / d_{mn}$    (59)
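
A minimal sketch of this distance-based quantitation follows. The
coordinates chosen for points m and n and the function name
response_fractions are hypothetical. The distance from point m to mixture
i, written d_mi here, is taken as the numerator for component 2, which is
the reading under which the two fractions sum to one for a mixture lying on
the segment between m and n.

```python
import numpy as np

def response_fractions(point_m, point_n, mixture):
    """Fractions of the total response due to components 1 and 2 for a
    unit-area mixture located on the segment joining points m and n."""
    d_mn = np.linalg.norm(point_m - point_n)
    f1 = np.linalg.norm(point_n - mixture) / d_mn    # d_ni / d_mn
    f2 = np.linalg.norm(point_m - mixture) / d_mn    # d_mi / d_mn
    return f1, f2

# Hypothetical positions of the pure-spectrum estimates in eigenvector space
m = np.array([0.30, 0.10])      # estimate for component 1
n = np.array([0.10, 0.40])      # estimate for component 2
mix = 0.75 * m + 0.25 * n       # a mixture that is 75 % component 1

f1, f2 = response_fractions(m, n, mix)
print(f1, f2)                   # approximately 0.75 and 0.25
```

The closer the mixture lies to point m, the larger its distance from point
n, and hence the larger the fraction assigned to component 1.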

The major problem to be addressed in quantitating mixture spectra is the
selection of the points m and n to be used as the best estimates of the
pure spectra, $y_1$ and $y_2$. Sharaf and Kowalski considered several
possibilities. First, if the widths of the solution bands, $F_I$ and
$F_{II}$, are equal to zero, then pure spectra of both components have been
measured and at least one specific sensor (e.g. wavelength, mass/charge
ratio) exists for each component. In this case no assumptions are necessary
to correctly quantify the mixture spectra. Second, if the solution band
widths are not zero but specific sensors are known to exist, then all
measured samples are mixtures of both components. Since specific sensors
are known, the outer edges of the solution bands are the correct choice for
the estimates of the pure component spectra. Third, if the solution band
widths are not zero and specific sensors are not known to exist, then some
assumptions must be made in order to quantify the mixture. The authors
recommended using the inner edges of the solution bands, i.e. the purest
spectra recorded, as an initial estimate of the pure component spectra.
Alternatively, the mid-points of each region may be used in the absence of
further information.


A similar approach to curve resolution has been used by Martens (49). The
major difference in the model used by Martens compared to that used by
Lawton and Sylvester is that, prior to extracting the eigenvectors of the
moment matrix, Martens normalized the mixture spectra to constant area and
centered the data matrix by subtracting the mean response of each sensor.
This resulted in one less eigenvector being required to represent the
mixture spectra in the reduced eigenvector space. Therefore, a mixture
spectrum containing two underlying components can be represented by a
linear combination of the mean and the first eigenvector of the centered
covariance matrix. The advantage of this additional step is two-fold:
first, one less dimension is necessary to represent the data, hence the
factor analysis solution is somewhat easier to interpret; and second, the
large trivial variance associated with the mean has been removed by
centering the data. Martens made the same assumptions as were used by
Lawton and Sylvester: first, only non-negative responses are allowed;
second, only non-negative quantities of analytes may be present; and third,
the pure component spectra are scaled to constant area. Spjøtvoll, Martens,
and Volden (15) have compared the constraint equations for the
two-dimensional case using the mean plus one eigenvector model to the
constraint equations as formulated by Lawton and Sylvester.

When  the  mean  plus  one  eigenvector  model  is  used,  quantitation  can  be 
accomplished  using  the  method  described  by  Sharaf  and  Kowalski  (48).  Osten 
and  Kowalski  (50)  have  recently  examined  the  quantitative  accuracy  of  self 
modeling  curve  resolution  for  the  analysis  of  UV  absorbance  data  obtained 
from  a  diode  array  high  performance  liquid  chromatography  detector. 

Warner et al. (51) have used an approach similar to curve resolution which
is based on the eigenanalysis of fluorescence emission-excitation matrices
(EEMs). This method makes use of the same assumptions as the Lawton and
Sylvester approach: only non-negative responses and non-negative quantities
of each component are permitted. Warner and coworkers have relaxed these
constraints, allowing some elements to be slightly below zero in order to
account for noise in the experimental data. Since the EEM represents data
involving two spectral dimensions, they have considered the uncertainties
in the estimated spectra for differing combinations of spectral overlap
involving either one or both spectral dimensions between the two pure
components.

The problem of generalizing curve resolution from the two-component
situations described above to the N-component case is not trivial. Martens
(49) examined the problem of three-component mixtures of cereal amino
acids. Ohta (52) has shown the solution of the three-component problem for
a mixture of photographic dyes. Very recently, Borgen and Kowalski (53)
have described a general solution for the N-component resolution case. In
all of these situations, the same non-negative quantity and non-negative
response constraints have been utilized.

The multivariate methods discussed can be used to improve the precision
and accuracy of an analytical procedure. The widespread incorporation of
microprocessors in analytical instrumentation can inundate the chemist with
raw data. In order to obtain valid chemical information from this wealth of
data, the analyst must consider not only the chemical system under
evaluation but also the advantages, disadvantages, limits, and assumptions
inherent in the various potential data analysis approaches.

Table I. Classification of Unconstrained Additive Mixtures

Class    Concentrations    Spectra
1A       known             known
1B       known             unknown
2A       unknown           known
2B       unknown           unknown

Figure 1. Generalized plot of two-dimensional eigenvector space. The outer
edges of the shaded region represent the non-negative response constraint.
The inner edges represent the non-negative quantity constraint. The regions
$F_I$ and $F_{II}$ are the allowable regions for the location of the pure
spectra m and n.